The false discovery rate for statistical pattern recognition

نویسندگان

  • Clayton Scott
  • Gowtham Bellala
  • Rebecca Willett
چکیده

Abstract: The false discovery rate (FDR) and false nondiscovery rate (FNDR) have received considerable attention in the literature on multiple testing. These performance measures are also appropriate for classification, and in this work we develop generalization error analyses for FDR and FNDR when learning a classifier from labeled training data. Unlike more conventional classification performance measures, the empirical FDR and FNDR are not binomial random variables but rather a ratio of binomials, which introduces challenges not addressed in conventional analyses. We develop distribution-free uniform deviation bounds and apply these to obtain finite sample bounds and strong universal consistency.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

The False Discovery Rate in Simultaneous Fisher and Adjusted Permutation Hypothesis Testing on Microarray Data

Background and Objectives: In recent years, new technologies have led to produce a large amount of data and in the field of biology, microarray technology has also dramatically developed. Meanwhile, the Fisher test is used to compare the control group with two or more experimental groups and also to detect the differentially expressed genes. In this study, the false discovery rate was investiga...

متن کامل

Pattern Recognition in Control Chart Using Neural Network based on a New Statistical Feature

Today for the expedition of the identification and timely correction of process deviations, it is necessary to use advanced techniques to minimize the costs of production of defective products. In this way control charts as one of the important tools for the statistical process control in combination with modern tools such as artificial neural networks have been used. The artificial neural netw...

متن کامل

Global Testing and Large-Scale Multiple Testing for High-Dimensional Covariance Structures

Driven by a wide range of contemporary applications, statistical inference for covariance structures has been an active area of current research in high-dimensional statistics. This paper provides a selective survey of some recent developments in hypothesis testing for high-dimensional covariance structures, including global testing for the overall pattern of the covariance structures and simul...

متن کامل

Determining Hit Rate in Pattern Search

The problem of spurious apparent patterns arising by chance is a fundamental one for pattern detection. Classical approaches, based on adjustments such as the Bonferroni procedure, are arguably not appropriate in a data mining context. Instead, methods based on the false discovery rate the proportion of flagged patterns which do not represent an underlying reality may be more relevant. We descr...

متن کامل

A New Statistical Approach for Recognizing and Classifying Patterns of Control Charts (RESEARCH NOTE)

Control chart pattern (CCP) recognition techniques are widely used to identify the potential process problems in modern industries. Recently, artificial neural network (ANN) –based techniques are very popular to recognize CCPs. However, finding the suitable architecture of an ANN-based CCP recognizer and its training process are time consuming and tedious. In addition, because of the black box ...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 1992